REJECTIONS UNDER 35 U.S.C. §112 



Claim 39 stands rejected under 35 U.S.C. §112, second 
paragraph, as allegedly indefinite for failing to provide a 
definition of histone analogs or to define the genus of histone 
analogs. In this regard, the Office Action asserts that it is 
unclear which proteins would be considered a histone analog. 

Applicant submits that the meaning of the term is 
clear in light of the description in the specification as well 
as that which is well known in the art. Histones are well known 
in the art. They have been the subject matter of many studies 
over the years and can be found described in many molecular or 
cell biology textbooks. Exemplary publications supporting that 
they are well known in the art include, for example, Meyers, 
R.A., Molecular Biology and Biotechnology, A Comprehensive Desk 
Reference, VCH Publishers, Inc., 413-17, (1995), and Lodish, H. 
et al., Molecular Cell Biology, Scientific American Books, 3rd 
Ed., 315-16, 346-348, (1995), attached hereto as Exhibits A and 
B, respectively. 



Further, the specification describes, for example, at 
pages 44-50 the use of histone H2B in various studies to label 
chromosomes by expression. Within these descriptions, histone 
H2B is labeled at either the amino- or carboxy-terminus with, 
for example, GFP (green fluorescent protein). Further described 
is the association of histone H2B in nucleosomes and its 
relationship as an H2A/H2B dimer with histone H1 and the histone 
H3/H4 tetramer. Accordingly, the specification sufficiently 
supports the use of labeled histones in the method of the 
invention. 

Similarly, amino acid and polypeptide analogs also are 
well known in the art. For example, amino acid analogs include 
modified forms of naturally and non-naturally occurring amino 
acids. Naturally occurring amino acids include the 20 (L)-amino 
acids utilized during protein biosynthesis as well as others 
such as 4-hydroxyproline, hydroxylysine, desmosine, 
isodesmosine, homocysteine, citrulline and ornithine, for 
example. Non-naturally occurring amino acids include, for 
example, (D)-amino acids, norleucine, norvaline, 
p-fluorophenylalanine, ethionine and the like. Modifications can 
include, for example, substitution or replacement of chemical 
groups and moieties on the amino acid, or derivatization or 
alternative synthesis of the amino acid. 

Specific examples of amino acid analogs can be found 
described in, for example, Roberts and Vellaccio, The Peptides: 
Analysis, Synthesis, Biology, Eds. Gross and Meinhofer, Vol. 5, 
pp. 341-358, Academic Press, Inc., New York, New York (1983), 
which is attached hereto as Exhibit C. Other examples include 
peralkylated amino acids, particularly permethylated amino 
acids, which can be found described in, for example, 
Combinatorial Chemistry, Eds. Wilson and Czarnik, Ch. 11, pp. 
235-237, John Wiley & Sons Inc., New York, New York (1997), 
attached as Exhibit D. Yet other examples include amino acids 
whose amide portion (and, therefore, the amide backbone of the 
resulting peptide) has been replaced, for example, by a sugar 
ring, steroid, benzodiazepine or carbocycle. An exemplary 
description of these analogs can be found in, for 
example, Burger's Medicinal Chemistry and Drug Discovery, Ed. 
Manfred E. Wolff, Ch. 15, pp. 619-627, John Wiley & Sons Inc., 
New York, New York (1995), attached as Exhibit E. 

In light of the teachings and guidance in the 
specification as well as the well known meaning in the art, 
Applicant maintains that the objected-to term is sufficiently clear 
to allow those skilled in the art to practice the invention as 
claimed. Accordingly, withdrawal of this ground of rejection is 
respectfully requested. 

REJECTIONS UNDER 35 U.S.C. §103 

Claims 33-38, 42-50, 53 and 54 stand rejected under 35 
U.S.C. § 103(a) as allegedly obvious over Robinett et al. in 
view of Abken et al. Robinett et al. is stated to describe a 
method for visualizing chromosomes by expressing a GFP-lac 
repressor-nuclear localization signal fusion protein. Abken et 
al. is stated to describe extrachromosomal DNA and double minute 
DNA as being chromosomal in origin. The Office Action alleges 
that it would have been obvious to use the visualization method 
described by Robinett et al. in a method for identifying agents 
that decrease or increase double minute chromosome formation. 
The rationale provided for increasing double minute chromosome 
formation is allegedly that double minute chromosomes are 
associated with carcinogenesis. 

To establish a prima facie case of obviousness, the 
Office must show that the prior art would have suggested the 
claimed device to one of ordinary skill in the art and that it 
could have been carried out with a reasonable likelihood of 
success when viewed in the light of the prior art. Brown & 
Williamson Tobacco v. Philip Morris, 229 F.3d 1120, 1124 (Fed. 
Cir. 2000). The first requirement of this test is at issue 
here because the Office Action simply asserts 
that it would have been obvious to use chromosome visualization to 
identify agents that decrease the amount of double minute (DM) 
chromosomes. Further, the reasoning that it would have been 
obvious to identify agents which increase DM chromosomes because 
they are associated with carcinogenesis is unclear. The Office 
has failed to show that such general conclusions are supported 
by the cited art. 

Establishing that the prior art would have suggested 
the claimed device requires an underlying factual showing of a 
suggestion, teaching, or motivation to combine the prior art 
references and is an "essential evidentiary component of an 
obviousness holding." Brown & Williamson Tobacco, 229 F.3d at 
1124-25 (quoting C.R. Bard, Inc. v. M3 Sys., Inc., 157 F.3d 
1340, 1351-52 (Fed. Cir. 1998)); see also C.R. Bard at 1351 
(obviousness requires some suggestion, motivation, or teaching 
in the prior art to select the components that the 
inventor selected and use them to make the new device); In re 
Kotzab, 217 F.3d 1365, 1370 (Fed. Cir. 2000) (there must be some 
motivation, suggestion or teaching in the prior art of the 
desirability of making the specific combination that was made by 
the applicant). The evidentiary showing must be clear and 




Inventors : Vs?Mhl et al . 
Serial No. : 09/229,229 
Filed: January 12, 1999 
Page 6 



particular, and broad conclusory statements about the teachings 
of the cited references, standing alone, are not "evidence." 
Brown & Williamson Tobacco, 229 F.3d at 1125 (quoting In re 
Dembiczak, 175 F.3d 994, 1000 (Fed. Cir. 1999), abrogated on other 
grounds by In re Gartside, 203 F.3d 1305, 53 USPQ2d 1769 
(Fed. Cir. 2000)). 



In the pending Office Action, there has been no 
underlying factual showing that it would have been obvious to 
one of ordinary skill in the art to have modified the alleged 
visualization method of Robinett et al. with the description of 
Abken et al. to obtain the claimed screening method. The Office 
has failed to point to clear and particular language suggesting 
use of any method to screen for agents that alter the amount of 
chromosomal DNA, much less DM DNA. Robinett et al. appears to be 
directed to chromosomal visualization methods. Further, 
Robinett et al. states that future applications of their method 
should "facilitate structural, functional, and genetic analysis 
of chromosome organization, chromosome dynamics, and nuclear 
architecture" (abstract, last sentence). These suggested future 
applications do not mention screening, and as such, Robinett et 
al. appears to be unconcerned with screening. Therefore, the 
assertion in the Office Action appears to be nothing more than a 
conclusory statement, unsupported by evidence. 
Accordingly, the Office has not met its burden of making a clear 
and particular showing of a suggestion, motivation or teaching 
of the claimed combination. 

One purpose of the evidentiary requirement for showing 
a suggestion, motivation or teaching of the claimed combination 
is to prevent impermissible hindsight reconstruction of the 
claimed invention based on Applicant's own disclosure. C.R. 
Bard, 157 F.3d at 1352; In re Dembiczak, 175 F.3d 994, 999 
("[c]ombining prior art references without evidence of such a 
suggestion, teaching, or motivation simply takes the inventor's 
disclosure as a blueprint for piecing together the prior art to 
defeat patentability - the essence of hindsight"). In 
determining the validity of a patented biopsy needle assembly over 
the sole assertion that it arose from obvious adaptations of a 
single prior art needle assembly to accommodate a new biopsy gun 
design, the court admonished against hindsight reconstruction 
when it stated: 




The invention that was made, however, does 
not make itself obvious; that suggestion or 
teaching must come from the prior art. 
See, e.g., Uniroyal, Inc. v. Rudkin-Wiley 
Corp., 837 F.2d 1044, 1051-52, 5 USPQ2d 
1434, 1438 (Fed. Cir. 1988) (it is 
impermissible to reconstruct the claimed 
invention from selected pieces of prior art 
absent some suggestion, teaching, or 
motivation in the prior art to do so); 
Interconnect Planning Corp. v. Feil, 774 
F.2d 1132, 1143, 227 USPQ 543, 551 
(Fed. Cir. 1985) (it is insufficient to select 
from the prior art the separate components 
of the inventor's combination, using the 
blueprint supplied by the inventor); 
Fromson v. Advance Offset Plate, Inc., 755 
F.2d 1549, 1556, 225 USPQ 26, 31 
(Fed. Cir. 1985) (the prior art must suggest 
to one of ordinary skill in the art the 
desirability of the claimed combination). 

The court went on to conclude that because no prior 
art provided a teaching, suggestion or motivation for the 
structure of the claimed needle assembly there was, as a matter 
of law, an absence of an essential evidentiary component for an 
obviousness finding. C.R. Bard at 1352. 



Similarly, here, the Office Action has taken 
Applicants' own teachings and used them against Applicants without 
additional support that the prior art would have suggested, 
motivated or taught one of ordinary skill to make the claimed 
combination. As described above, Robinett et al. appears to have 
been unconcerned with methods to screen for agents that alter 
the amount of DM DNA. Similarly, Abken et al. also does not 
suggest screening for agents that alter the amount of DM DNA. 
Instead, Abken et al. is alleged to describe that DM DNA is 
chromosomal in origin and that it may cause dysregulation of 
cancer cell growth. 



The Office Action neither cites art showing a 
combination of chromosome visualization with screening methods 
nor cites text in the cited references that provides a 
suggestion, motivation or teaching to achieve the claimed 
combination. The alleged rationale fails to support any 
motivation because there is no evidence that either Robinett et 
al. or Abken et al. considered screening for agents of any kind. 

Reliance on "common knowledge and common sense" to 
fill the void for the required showing of a suggestion for a 
claimed combination of elements does not substitute for the 
obligation to cite references to support an obviousness conclusion. 
In re Thrift, 298 F.3d 1357, 1364 (Fed. Cir. 2002). 
Consequently, absent such an evidentiary showing, the rejection 
amounts to nothing more than impermissible hindsight 
reconstruction based on reading Applicant's own disclosure and 
reliance on unsupported conclusory statements. Applicants 
therefore respectfully request that the rejection of claims 
33-36, 39-50, 53 and 54 be withdrawn. 



CONCLUSION 

In light of the Remarks herein, Applicant submits that 
the claims are now in condition for allowance and respectfully 
requests a notice to this effect. Should the Examiner have any 
questions, she is invited to call the undersigned attorney. 

Respectfully submitted, 



September 10, 2003 
[...] either because of lack of informative markers or because 
of uncertainties about when the hemophilia mutation had arisen in 
the family. 

In hemophilia B, the development of rapid methods for detecting 
virtually all hemophilia B mutations now allows diagnoses based 
on the direct detection of the gene defect and ensures success in 
virtually every family (Figure 3b). In the United Kingdom a national 
strategy is being implemented for the provision of genetic counseling. 
This entails the construction of a national confidential database 
of mutation, hematological, and pedigree information that can be 
used to provide carrier and prenatal diagnosis to the blood relatives 
of the patients listed in the database by examination of the region 
of the gene defective in the index patient. This allows precise, rapid, 
and economical diagnoses. Similar developments in hemophilia A 
may occur later, in spite of the size and complexity of the factor 
VIII gene. The inversion mutations involving intron 22 are now 
the easiest to identify. Rapid methods are beginning to become 
available for the detection of the remaining hemophilia A mutations. 

See also Genetic Testing; Human Disease Gene Mapping. 
Histones 
Andre J. van Wijnen 

Key Words 

Cell Cycle The interval between the completion of mitosis in the 
parent cell and the completion of the next mitosis in one or 
both progeny cells. The periods of the cell cycle are sequentially 
defined as mitosis (prophase, metaphase, anaphase, and 
telophase), G1 (the period between the completion of mitosis 
and the onset of DNA replication), S phase (the period of the 
cycle during which DNA replication occurs), and G2 (the 
period between the completion of DNA replication and the 
onset of mitosis). 

Histone Proteins Five principal species of basic chromosomal 
proteins designated H2a, H2b, H3, H4, and H1, which range 
in size from 11,000 to 25,000 Da. Histone proteins complex 
with DNA to form the primary unit of chromatin structure, 
the nucleosome. 

Nucleosome The primary unit of chromatin structure in eukaryotic 
cells, consisting of approximately 200 nucleotide base pairs 
of DNA and two each of the core histone proteins (H2a, H2b, 
H3, and H4). 

Posttranscriptional Control The components of gene expression 
involving regulation mediated at the level of messenger 
RNA processing within the nucleus and/or cytoplasm, the 
translatability and/or stability of mRNA, or the assembly or 
posttranslational modifications of polypeptides. 

Promoter Regulatory Elements DNA sequences, generally but 
not necessarily 5' (upstream) from the mRNA transcription 
initiation site, which modulate the specificity and/or level 
of transcription. 

Transcriptional Control The component of gene expression involving 
the synthesis of RNA, utilizing DNA as a template. 



Histones are positively charged nuclear proteins that are ubiquitously 
represented in eukaryotic cells for packaging DNA into the 
protein-DNA complex termed chromatin. Histone-DNA complexes 
form the primary unit of chromatin structure, the nucleosome. 
Modifications in the interactions of histones with DNA in 
specific regions of genes occur in association with changes in gene 
expression. Mammalian and nonmammalian histone genes have 
been cloned and characterized with respect to the regulation of 
expression. The histone genes are a multigene family, and most 
are expressed in proliferating cells at the time in the cell cycle 
when DNA is replicated, providing histone proteins to package 
newly replicated DNA into chromatin. Other histone genes are 
expressed postproliferatively to support structural and transcriptional 
requirements of specialized cells. Regulatory sequences of 
histone genes, which determine the specificity of levels of transcription, 
as well as factors that bind to regulatory elements to mediate 
histone gene expression, have been identified. 

1 GENERAL CHARACTERISTICS 

1.1 The Biological and Structural Properties of 
Histone Proteins 

There are five principal species of histone proteins, designated 
H2a, H2b, H3, H4, and H1, ranging in size from 11,000 to 25,000 
Da. They are positively charged, as a result of high contents of 
the basic amino acids arginine, lysine, and histidine, which facilitate 
the interactions of histones with negatively charged DNA molecules. 
The amino acid sequences of the histone proteins have been 
highly conserved during evolution, reflecting the conserved role 
of these proteins in chromatin structure and the apparently stringent 
requirement to support conservation of the primary unit of chromatin 
structure, the nucleosome. 

The histone proteins are encoded in a multigene family with 
multiple (e.g., approximately 20 copies in human cells), nonidentical 
copies of each core (H2a, H2b, H3, and H4) and H1 gene. The 
histone polypeptides can be separated into the following categories: 

1. Those that are represented in most cells and tissues and synthesized 
only in proliferating cells at the time of DNA synthesis 
(>90%). 

2. Those that are found in many cells and tissues but are expressed 
independently of proliferation, either constitutively 
during the cell cycle or following the completion of proliferation 
at the onset of tissue-specific gene expression associated 
with differentiation. 

3. Those that are expressed solely in specialized cell types, such 
as spermatocytes and avian erythrocytes, in which there are 
highly specific requirements for modifications in the packaging 
of DNA into chromatin. In lower eukaryotes, and apparently 
only in these organisms, there are multiple copies of 
the histone genes, providing large quantities of "stored" 
histone mRNA in oocytes that can support histone protein 
synthesis during the rapid series of initial cell divisions that 
immediately follows fertilization. 

Additional heterogeneity of the histone proteins is reflected by 
posttranslational modifications that include acetylation, methylation, 
phosphorylation, and adenosine diphosphate (ADP) ribosylation. 
Such modifications alter the distribution of charge in specific 
domains of the histone proteins and, together with hydrophobic 
bonding, may influence histone-DNA as well as histone-histone 
interaction. These posttranslational modifications are involved in 
the incorporation of newly synthesized histones into chromatin and 
may provide a basis for changes in the interactions of histones 
with DNA for remodeling chromatin architecture: for example, in 
condensation of chromatin into discrete chromosomes at the onset 
of mitosis, and in modification of chromatin structure when the 
expression of specific genes is activated or repressed. These 
changes in histone-mediated chromatin structure are rapid and re- 
versible, supporting cellular responsiveness to a broad spectrum 
of physiological signals that mediate transcription of cell growth, 
housekeeping, and tissue-specific genes. 

1.2 The Contribution of Histones to 
Chromatin Structure 

There is a requirement for the ordered packaging of 2.5 yards 
of DNA within the confines of the mammalian cell nucleus. To 
accommodate this DNA packaging into nucleosomes, every 
nucleus contains approximately 300 million histone molecules. Each 
nucleosome consists of a core particle of approximately 140 base 
pairs of DNA wound around a complex consisting of two each of the 
H2a, H2b, H3, and H4 molecules and a linker DNA region of 
approximately 40-60 base pairs (Figure 1). Under the electron microscope 
the nucleosomes appear as a series of beads (protein-DNA complexes) 
on a string (linker DNA joining the nucleosomes). The H1 
histones bind to the linker region and participate in 
nucleosome-nucleosome interactions. This organization accounts for only a 10 
nm chromatin fiber and a packing ratio of 7. The 10 nm 
beads-on-a-string structures are packed as a 30 nm chromatin fiber, and 
further packaging results in chromatin fibers of 100 nm. "Nonhistone," 
sequence-specific DNA binding proteins can mediate DNA 
conformation and modulate histone-DNA interactions. Thus, it 
appears that the contributions of histones to transcriptional regulation 
are through facilitation of conformational properties of DNA 
that are responsive to gene-specific transcription factors. 
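
The figures quoted in this section can be checked against one another with a little arithmetic. The sketch below is illustrative only and rests on assumptions that are not stated in the text: a rise of 0.34 nm per base pair (the usual value for B-form DNA), one H1 molecule per nucleosome in addition to the eight core histones, and one nucleosome occupying roughly 10 nm of fiber length. The function and parameter names are hypothetical.

```python
YARD_IN_METERS = 0.9144
RISE_NM_PER_BP = 0.34  # assumption: B-form DNA, not stated in the text


def chromatin_estimates(dna_length_yards=2.5,
                        bp_per_nucleosome=200,
                        histones_per_nucleosome=9,
                        fiber_nm_per_nucleosome=10.0):
    """Rough consistency check of the packaging figures in the text.

    bp_per_nucleosome: ~140 bp core plus 40-60 bp linker (from the text).
    histones_per_nucleosome: 8 core histones plus 1 H1 (assumption).
    fiber_nm_per_nucleosome: fiber length taken up by one nucleosome
        in the 10 nm fiber (assumption).
    """
    dna_length_nm = dna_length_yards * YARD_IN_METERS * 1e9
    base_pairs = dna_length_nm / RISE_NM_PER_BP
    nucleosomes = base_pairs / bp_per_nucleosome
    histone_molecules = nucleosomes * histones_per_nucleosome
    # Packing ratio: length of extended DNA per nucleosome divided by
    # the fiber length that the nucleosome occupies.
    packing_ratio = (bp_per_nucleosome * RISE_NM_PER_BP
                     / fiber_nm_per_nucleosome)
    return {
        "base_pairs": base_pairs,
        "nucleosomes": nucleosomes,
        "histone_molecules": histone_molecules,
        "packing_ratio": packing_ratio,
    }


if __name__ == "__main__":
    for name, value in chromatin_estimates().items():
        print(f"{name}: {value:.3g}")
```

Under these assumptions the estimate comes out at roughly 3 x 10^8 histone molecules per nucleus and a packing ratio near 7, in line with the values given above.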



2 EXPRESSION OF HISTONE GENES 

A functional, as well as temporal, relationship between DNA replication 
and the expression of mammalian core and H1 histone genes 
was initially indicated by the constant histone-DNA ratio (1:1) 
observed in a broad spectrum of cells, tissues, and organs, and by 
the doubling of cellular levels of histone protein during the S phase 
of the cell cycle. Direct measurements then confirmed that histone 
protein synthesis is largely confined to S phase and that inhibition 
of DNA replication results in a rapid cessation of histone protein 
synthesis. The cellular levels of histone mRNA reflect cellular 




Nucleosomes = DNA & core histones (H2A/H2B/H3/H4) 

30 nm chromatin fiber = nucleosomes & linker histone (H1) 

Figure 1. Three principal levels of chromatin organization. Top: The 2 nm, deproteinized, double-stranded DNA double helix. Middle: The organization 
of DNA into nucleosomes. The beads-on-a-string structure comprises a 10 nm fiber; each bead consists of two each of core histone proteins (H2A, H2B, 
H3, and H4). The string component of the structure is the DNA. Bottom: The higher order organization of chromatin structure mediated by association of 
nucleosomes through linker histone H1 into a 30 nm chromatin fiber. 
levels of both histone protein synthesis and DNA replication. Similarly, 
inhibition of DNA replication brings about a dose-dependent 
loss (selective destabilization) of histone mRNAs, which parallels 
decreases in DNA and histone synthesis. Measurements of histone 
gene transcription indicate enhanced synthesis of histone mRNAs 
early during the S phase of the cell cycle. 

The increased transcription of histone genes early during S phase 
and the coordinate accumulation of histone mRNAs for core and 
H1 histone proteins that closely parallels the initiation of DNA 
and histone protein synthesis suggest that the onset of histone gene 
expression is at least in part transcriptionally mediated. Throughout 
S phase, the synthesis of histone proteins is modulated by the 
availability of histone mRNAs. The stabilization of histone mRNAs 
throughout S phase and the destabilization of histone mRNAs when 
DNA replication is completed or inhibited are highly selective, 
and largely posttranscriptionally controlled. At the onset of differ- 
entiation in mammalian cells, the histone genes that are under cell 
cycle regulation are down-regulated transcriptionally. When DNA 
replication is completed during the terminal cell cycle, histone 
protein synthesis ceases, histone mRNA is degraded, and both basal 
and enhanced levels of histone gene transcription are abrogated. 

3 ORGANIZATION AND REGULATION OF 
HISTONE GENES 

3.1 Organization of Cell-Cycle-Regulated 
Histone Genes 

In mammalian cells, the cell-cycle-regulated histone genes are 
organized into clusters of core alone (H2a, H2b, H3, and H4) or 
core together with H1 histone coding sequences (Figure 2). Within 
these clusters, which are represented on at least two chromosomes, 
there is generally a pairing of H2a with H2b genes and H3 with 
H4 genes. In lower eukaryotes such as sea urchin and Drosophila, 
a similar organization is found for the cell-cycle-regulated genes 
encoding somatic cell histone proteins. However, the histone genes 
expressed during oogenesis in these organisms are organized as 
simple, tandemly repeated clusters that contain one of each of the 
five types of histone gene. 

Despite the clustering of cell-cycle-regulated histone genes, each 
histone coding sequence is an independent transcription unit with 
a unique promoter and mRNA coding sequence. All amino acids 
of the histone protein are encoded in contiguous nucleotides be- 
cause these genes lack introns. Also noteworthy are the absence 
of a polyadenylation site and the presence of sequences with hy- 
phenated dyad symmetry that form a stem-loop structure in the 
3' region as well as nontranslated leader and trailer segments of 
the mRNA that are less than 50 nucleotides long. 



3.2 Promoter Elements and Transcription Factors 
That Regulate Cell-Cycle-Dependent 
Histone Gene Expression 

Figure 3 is a schematic representation of the regulatory organization 
of the initial thousand base pairs of an H4 histone gene promoter. 
While this region contains the minimal sequences required for 
regulated expression, the functional limits of the H4 gene appear 
to extend considerably upstream. Indeed, cis-acting elements up 
to -6.5 kb may influence developmental expression of the H4 
histone gene in vivo in transgenic animals. Two domains of in 
vivo protein-DNA interactions for the H4 histone gene have been 
established in the intact cell at single nucleotide (nt) resolution. 
These have been designated H4-site I (nt -156 to -113) and 
H4-site II (nt -97 to -47). The proximal promoter domain H4-site I 
is a bipartite cis-activating element that interacts distally with a 
member of the ATF family of transcription factors, and proximally 
with the GC box binding protein (Sp1) HiNF-C. These factors are 
capable of mediating a fivefold stimulation of transcription. The 
H4-site II domain represents a mosaic of functional recognition 
sequences that contribute to H4 gene transcription. H4-site II is a 
multipartite protein-DNA interaction site for sequence-specific 
Figure 2. The organization of genomic DNA segments containing some of the human histone coding sequences. Arrows designate directions of transcription. 
H2B and H2A pseudogenes are designated by the symbol ψ. 

Figure 3. Schematic representation of promoter regulatory elements and transcription factors that support histone gene expression. [...] 
and organization of gene regulatory sequences is designated by sites I-IV. The ovals and boxes represent transcription factors. Proximal and distal [...] 
regulatory elements are designated along with nuclease sensitive regions (DNase HS, MNase HS). Also shown are sites of histone gene [...] 
elements both during the cell cycle and following differentiation. These modifications in protein-DNA interactions control the extent to which the histone 
gene is transcribed, which is indicated by the thickness of the horizontal arrows over the mRNA regions of the gene. 



factors HiNF-D, HiNF-M, and HiNF-P (H4-TF2). The proximal 
region of H4-site II spans a TATA motif and is sufficient to mediate 
accurate transcription initiation in vivo. However, the distal region 
of H4-site II influences transcriptional competency, as well as the 
timing and extent of H4 mRNA synthesis in vivo. This site II distal 
region contains several distinct sequence motifs that either stimulate 
the basal level of H4 gene transcription (C box) or influence periodic 
levels of transcription (M box). The distal activating elements 
H4-sites III and IV encompass regions that stimulate transcription 
in vivo and interact with the heteromeric nuclear factors H4UA-1 
and H4UA-3, respectively. Additionally, H4-site IV overlaps 
with a putative nuclear matrix attachment site spanning nt -730 
to -589. This element interacts with a sequence-specific nuclear 
matrix protein (NMP-1), and may influence expression of the H4 
histone gene promoter by transient anchorage to the nuclear matrix. 
The integration of mechanisms controlling the coordinately regulated 
transcription of multiple histone genes may involve several 
shared promoter-binding activities, including both ubiquitous and 
histone-gene-specific transcription factors. HiNF-D related 
protein-DNA interactions are also represented in H3 and H1 histone 
gene promoters, suggesting the possibility of coordinate transcription 
factor interactions regulating several histone gene classes. 
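
As a reading aid, the promoter coordinates quoted in this section can be collected into a small lookup. This is only a sketch: it lists the three elements for which the text gives explicit positions (H4-site I, H4-site II, and the putative nuclear matrix attachment site), the coordinates are as read from the text, and the dictionary and function names are hypothetical.

```python
# Positions are nucleotides relative to the transcription start site,
# as quoted in the text. Sites III and IV are omitted because the text
# gives no explicit coordinates for them.
H4_PROMOTER_ELEMENTS = {
    "H4-site I": (-156, -113),
    "H4-site II": (-97, -47),
    "putative nuclear matrix attachment site": (-730, -589),
}


def elements_at(position):
    """Return the names of listed elements that span the given position."""
    return [
        name
        for name, (start, end) in H4_PROMOTER_ELEMENTS.items()
        if start <= position <= end
    ]


if __name__ == "__main__":
    for nt in (-600, -120, -100, -60):
        print(nt, elements_at(nt))
```

A position such as nt -100 falls between site I and site II and so returns an empty list.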

Insight into transcriptional control of histone gene expression 
has been provided by identification of modifications in interactions 
of promoter binding factors within the initial thousand base pairs 
of a human H4 histone gene promoter at sites I, II, III, and IV, 
and relating these to the extent of gene transcription. Protein-DNA 
interactions at these regulatory elements during the cell cycle and 
with the down-regulation of proliferation during differentiation are 
schematically shown in Figure 3. 

See also Chromatin Formation and Structure; Gene 
Expression, Regulation of; Protein Designs for the 
Specific Recognition of DNA. 
HPLC of Biological Macromolecules 

Karen M. Gooding 

Key Words 

Bonded Phase Organic coating or layer covering the surface of 
the solid HPLC support and containing the functional groups 
responsible for separation. 

Elution Process of a solute passing through and coming out of 
a chromatography column. 

Gradient Systematic variation of the mobile phase composition 
during an HPLC analysis. 

Packing Adsorbent, gel or solid support used in the HPLC 
column. 



High performance liquid chromatography (HPLC) is a high resolu- 
tion separation process using a liquid mobile phase and a column 
containing microparticulate solid particles coated with a specific 
functional group. The functional groups, which can be neutral, 
charged, or hydrophobic, cause separation of components of a 
mixture by the specific physical interaction. The primary modes 
of HPLC for biological macromolecules are reversed phase, ion 
exchange, size exclusion, and hydrophobic interaction chromatog- 
raphy. These rapid and high resolution methods have provided a 
means of purification, separation, and analysis of peptides and 
proteins in biotechnology, microbiology, university, and clinical 
laboratories. 

1 INTRODUCTION 

Liquid chromatography is a separation process in which the components 
in a mixture migrate in a liquid stream through a packed bed 
of particles that retard some of the components differentially by a 
specific physical property. The particles that compose the column 
have a uniform physical characteristic, such as hydrophobicity, 
charge, or porosity, which brings about separation by causing molecules 
or ions to interact or pass through at different rates. 

Liquid chromatography has been an important method of separat- 
ing and purifying proteins and nucleic acids because these sub- 
stances are soluble and often stable in aqueous buffers. For many 
years, methods utilized columns containing carbohydrate matrices 
that achieved good separations in hours or days; flow rates were 
slow because they were based on gravity. In the mid-1970s, rigid 
packing materials composed of silica or polymer with diameters 
of 5 to 10 µm were developed to be used with liquids pressurized 
to several thousand psi. This technique, initially known as high 
pressure liquid chromatography, is now called high performance 
liquid chromatography (HPLC). Chemical modification of the sur- 
face of the silica or gel support, known as the bonded phase, gives 
the specific basis for retention. 

Columns packed with microparticulate HPLC supports can separate
biological macromolecules in minutes with excellent resolution
and recovery of biological activity. Since the 1970s, both the tech- 
nology of producing HPLC columns and the understanding of their 
operation have improved dramatically, resulting in the widespread 
use of HPLC for protein and peptide analysis and purification. 
Although biological macromolecules include polypeptides, poly- 
nucleotides, and polysaccharides, this entry primarily discusses 
polypeptides because of the vast amount of research on the subject.
The principles are generally applicable to biomolecules in all
three categories.

2 INSTRUMENTATION 

2.1 General 

A high performance liquid chromatograph consists of one or more
pumps, a sample injector, a column, a detector, and a data recorder.
If a single solvent is used as the mobile phase, the method is
termed isocratic. In many cases, more than one solvent must be
used to release the bound molecules and cause them to elute from 
the column. Multiple solvents are usually combined in a pro- 
grammed gradient from one composition to another. The time and 
variation of composition is called the gradient. Figure 1 illustrates 
the typical configuration of an HPLC with two pumps. 
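
As a rough illustration of a programmed gradient, the sketch below computes the mobile-phase composition at a given time for a linear two-solvent program. The function name, the solvent label "B", and the example values (5% to 60% B over 30 min) are assumptions chosen for illustration; the text does not specify a gradient shape or any numbers.

```python
def percent_b(t_min, start_b=5.0, end_b=60.0, duration_min=30.0):
    """Return the percentage of solvent B delivered at time t_min.

    Assumes a simple linear gradient between two solvents (hypothetical
    values); real instruments also support steps and curved programs.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    # Hold the starting composition before the program begins and the
    # final composition after it ends.
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start_b + (end_b - start_b) * fraction


if __name__ == "__main__":
    for t in (0, 15, 30):
        print(f"{t:>2} min: {percent_b(t):.1f}% B")
```

An isocratic run corresponds to the special case in which the starting and final compositions are equal, so the composition never changes.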

2.2 Columns 

The column is the key element of the HPLC system. The physical 
process by which the molecules bind will determine which mobile 
phase will promote binding and which will release, thereby causing 
elution. The packing material, or support, in the column is com- 
posed of a rigid material, such as silica or a polymer, which can 
be derivatized or covalently bonded with functional groups; this 
chemical layer is called the bonded phase. HPLC supports usually 
have particle diameters of 5 to 10 µm and may be porous or nonporous.
For biological macromolecules, pores must be at least 300
Å in diameter to allow access, whereas small molecules are typically
run on supports with pores of 80 to 100 Å in diameter.

2.3 Detectors 

Detectors for HPLC tend to be selective rather than general. Refrac- 
tive index detectors, which produce a signal for all solutes, are the 
primary devices used, but their sensitivity is low. Light-scattering
or "mass" detectors are not very sensitive and have nonlinear

Present-day gibbons survive perfectly well with one adult
β-like globin gene.

Pseudogenes have been identified in various other gene 
families, including the tubulin and actin gene families. In 
addition to the complete but nonfunctional gene copies 
that constitute pseudogenes, partial copies of some genes 
have been identified. For example, sequences correspond- 
ing to fragments of the 5' and 3' ends of the tubulin genes 
are quite common in human DNA. These presumably 
arose by unequal crossovers within the tubulin genes, 
rather than in adjacent regions as diagrammed in Figure 
9-6. As discussed in a later section, other nonfunctional 
gene copies can arise by reverse transcription of mRNA 
into cDNA and integration of this intron-less DNA into a 
chromosome. 

rRNAs, tRNAs, and Histones Are Encoded 
by Tandemly Repeated Genes 

The genes for the rRNAs, each type of tRNA, and one 
family of proteins, the histones, which package nuclear 
DNA into chromatin, occur in invertebrates and some vertebrates
as tandemly repeated arrays. These are distinguished
from the duplicated genes of gene families in that
the multiple tandemly repeated genes encode identical or 
nearly identical proteins or functional RNAs. Most often 
copies of a sequence appear one after the other, in a head- 
to-tail fashion, over a long stretch of DNA. Within a tan- 
dem array of rRNA or tRNA genes, each copy is exactly, 
or almost exactly, like all the others. Although the tran- 
scribed portions of rRNA genes are the same in a given 
individual, the nontranscribed spacer regions between the 
transcribed regions can vary. Arrays of tandemly repeated 
histone DNA are somewhat more complex; however, each 
histone gene, too, has multiple identical copies. 

The tandemly repeated rRNA, tRNA, and histone 
genes are needed to meet the great cellular demand for 
their transcripts. Most of the RNA in a cell consists of 
rRNA and tRNA. Assuming RNA polymerase molecules 
move at a fixed speed, there must be a limit to the number 
of RNA copies that transcription of a single gene can pro- 
vide during one cell generation, even if it is fully loaded 
with polymerase molecules. If more RNA is required than 
can be transcribed from one gene, multiple copies of the 
gene are necessary, as illustrated in Figure 9-7 for the synthesis
of pre-rRNA, which is processed into 18S, 5.8S, and
28S rRNA. For example, during early embryonic development
in humans, many embryonic cells have a doubling
time of ≈24 h and contain 5-10 million ribosomes. To
produce enough rRNA to form this many ribosomes, an
embryonic human cell needs at least 100 copies of the pre-rRNA
gene, and most of these must be close to maximally
active for the cell to divide every 24 h (see Figure 9-7c).
The importance of repeated rRNA genes is illustrated by
Drosophila mutants called "bobbed" (because they have
stubby wings), which lack a full complement of the tandemly
repeated rRNA genes. A bobbed mutation that reduces
the number of rRNA genes to less than ≈50 is a
recessive lethal mutation.

FIGURE 9-7 Effect of copy number and loading with
RNA polymerase I on rate of synthesis of pre-rRNA in human
cells. Genes encoding pre-rRNAs, which are processed into
the 18S, 5.8S, and 28S rRNAs, are transcribed by the enzyme
RNA polymerase I. Transcription of the pre-rRNA gene by a
single molecule of RNA polymerase I takes about 5 min.
(a) If a cell contained one copy of the pre-rRNA gene, which
was transcribed by one polymerase at a time, it could produce
a maximum of 288 copies per 24 h. (b) The yield of
pre-rRNA from a single copy of the pre-rRNA gene would
increase substantially if the gene was maximally loaded with
≈250 polymerase molecules. (c) The highest rate of pre-rRNA
synthesis is possible when a cell contains multiple
copies of the pre-rRNA and these are transcribed by many
polymerase molecules at one time. (Duplicate genes are indicated
by small blue rectangles and polymerases by red circles.)
In order to generate enough rRNA to divide every 24 h,
human embryonic cells must have at least 100 copies of the
rRNA gene and these must be near maximally loaded with
RNA polymerase I.
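
The arithmetic behind these figures can be checked with a short sketch. The constants are the values quoted in the text and in the Figure 9-7 caption (5 min per transcript, about 250 polymerases per fully loaded gene, 100 gene copies). The function name and the steady-state simplification, in which every polymerase finishes one transcript per 5 min with no gaps, are assumptions made for illustration.

```python
MINUTES_PER_DAY = 24 * 60
TRANSCRIPTION_MINUTES = 5  # one polymerase traversing the gene (quoted value)


def transcripts_per_day(gene_copies, polymerases_per_gene):
    """Upper bound on pre-rRNA transcripts made in 24 h.

    Simplifying assumption: each polymerase completes one transcript
    every TRANSCRIPTION_MINUTES and restarts immediately.
    """
    rounds_per_polymerase = MINUTES_PER_DAY // TRANSCRIPTION_MINUTES
    return gene_copies * polymerases_per_gene * rounds_per_polymerase


if __name__ == "__main__":
    print(transcripts_per_day(1, 1))      # one gene, one polymerase at a time
    print(transcripts_per_day(1, 250))    # one gene, maximally loaded
    print(transcripts_per_day(100, 250))  # 100 genes, maximally loaded
```

Under these assumptions the first case gives the 288 copies per 24 h stated in the caption, and only the third case (7.2 million) falls within the 5-10 million ribosomes a dividing embryonic cell must supply.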

Genes encoding many functional RNAs other than 
mRNA exist in multiple copies in eukaryotic cells (Table 
9-3). All species, including yeasts, contain 100 or more 
copies of the genes encoding 5S rRNA and pre-rRNA. 
More than 20,000 copies of the 5S rRNA gene are present 
in frogs. The copy number for individual tRNA genes 
ranges from 10 to 100. The multiple copies of all the rRNA 
genes occur in tandem arrays. 



TABLE 9-3 Copy Number of Tandemly Repeated Genes Encoding
Structural RNAs in Several Eukaryotes*

                                    Number of Copies
                            ---------------------------------
                            Pre-rRNA    5S-rRNA     tRNA
Species                     Gene        Gene        Genes†
-------------------------------------------------------------
Saccharomyces cerevisiae    140         140         250
Dictyostelium discoideum    180         180         ?
Tetrahymena pyriformis
    Micronucleus‡           1           300         800
    Macronucleus            200         300         800
Drosophila melanogaster
    X chromosome            250         165         860
    Y chromosome            150         165         860
Xenopus laevis              450         24,000      1150
Human                       ≈250        2000        1300
-------------------------------------------------------------

*The copy numbers in this table were estimated by
hybridizing saturating amounts of labeled RNA to DNA.
†The tRNA numbers include all tRNA sites and therefore
represent more than 50 different tRNA genes in some
organisms. Copy numbers for individual tRNAs range from
10-100.
‡The micronucleus is inactive in synthesis of pre-rRNA.
SOURCE: B. Lewin, 1980, Gene Expression, Vol. 2, Wiley,
p. 876.



Discovery of Repetitious
DNA Fractions

Besides the duplicated protein-coding genes and the tan- 
demly repeated genes encoding rRNAs, tRNAs, and his- 
tones discussed in the previous section, eukaryotic cells 
contain multiple copies of other DNA sequences in the ge- 
nome. These are generally referred to as repetitious DNA 
(see Table 9-1). Some of these sequences are quite short 
and occur as tandem repeats; others are much longer and 
are interspersed at many places in the genome. The exis- 
tence of these repeated sequences was first recognized in 
reassociation experiments in which denatured eukaryotic 
DNA was observed to renature nonuniformly; that is, 
some of it reassociated much more rapidly than the bulk of 
cellular DNA. Here we briefly review the experimental evi- 
dence that led to discovery of the two major classes of 
repetitious DNA; later, we discuss each class in more 
detail. 



Repeated DNA Reassociates More Rapidly 
Than Nonrepeated DNA 

Suppose that the total DNA of an organism is broken into 
fragments with an average length of about 1000 base pairs.
The DNA is then melted into single strands and placed 
under conditions that allow strand reassociation to occur 
(e.g., a favorable ion concentration and a favorable tem- 
perature). All the DNA fragments would re-form duplexes 
at about the same speed if none contained sequences that 
were repeated in the genome. However, a segment contain- 
ing a sequence repeated many times in the genome would 
find a complementary partner more quickly than a segment 
with a sequence that occurs only once per haploid genome, 
because the repeated sequence would be present at a much 
higher concentration. Consequently the repeated sequence 
would reassociate faster than the fragment of unique se- 
quence. For this reason, the DNA encoding pre-rRNA and 
that encoding 5S rRNA reassociate faster than does nonre- 
peated DNA. 

The parameters that affect the degree to which single- 
stranded DNA reassociates are its initial concentration and 
the time allowed for the reaction. The C₀t of a reaction is
the product of the concentration of the DNA measured in
moles of nucleotide per liter (C₀) and the reaction time (t) in
seconds. A convenient term for comparing the reassociation
rates of different DNA fractions is the C₀t₁/₂ value, that is,
the C₀t at which one-half of a given fraction renatures. The
lower the value of C₀t₁/₂, the higher the reassociation rate.
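
A minimal sketch of this relationship follows, assuming ideal second-order reassociation kinetics, in which the fraction of DNA still single-stranded is 1 / (1 + C₀t / C₀t₁/₂). That kinetic model is the standard textbook idealization and is not stated in the passage; the function name and the example C₀t₁/₂ values are hypothetical.

```python
def fraction_single_stranded(c0t, c0t_half):
    """Fraction of DNA still single-stranded at a given C0t value.

    Assumes ideal second-order kinetics (not stated in the text):
    C / C0 = 1 / (1 + C0t / C0t_half).
    """
    if c0t < 0 or c0t_half <= 0:
        raise ValueError("c0t must be >= 0 and c0t_half must be > 0")
    return 1.0 / (1.0 + c0t / c0t_half)


if __name__ == "__main__":
    # Hypothetical fractions: a repeated sequence with a low C0t_half and
    # a single-copy sequence with a high one, compared at the same C0t.
    for label, half in (("repeated", 0.01), ("single-copy", 100.0)):
        remaining = fraction_single_stranded(1.0, half)
        print(f"{label}: {1.0 - remaining:.3f} reassociated at C0t = 1")
```

At C₀t equal to C₀t₁/₂ the function returns exactly one-half, matching the definition above, and the fraction with the lower C₀t₁/₂ is always further along at any given C₀t.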
By comparing the C₀t₁/₂ value of any particular DNA fraction
with that of a "standard" nucleic acid (e.g., a viral or
bacterial DNA of known length, both of which have either
no or very few repetitive sequences), the approximate frequency
of repeats within the fraction of interest can be
determined.

somal DNA, causing it to fold into a more compact structure.
The most abundant of these proteins, H-NS, is a
dimer of a 15.6-kDa polypeptide. H-NS binds DNA tightly
and compacts it considerably, as measured by an increased
rate of sedimentation during centrifugation and decreased
viscosity. There are about 20,000 H-NS molecules per cell,
enough for one H-NS dimer per ≈400 base pairs of DNA.

Finally, E. coli chromosomal DNA is tightly supercoiled,
that is, twisted upon itself like the circular
SV40 DNA shown in Figure 4-14. As discussed in Chapter
10, an E. coli enzyme called DNA gyrase can introduce
negative supercoils into DNA. Supercoiling contributes to
the compaction necessary to fit chromosomal DNA into
the bacterial cell. Figure 9-45 is an electron micrograph of
an isolated, highly supercoiled E. coli chromosome attached
to a fragment of cell membrane.

FIGURE 9-45 Electron micrograph of an isolated
folded E. coli chromosome. The highly supercoiled DNA is
attached to a fragment of the cell membrane appearing as
the most darkly staining material in the micrograph. Although
the highly supercoiled nature of the E. coli chromosome is
illustrated by this electron micrograph, the chromosome actually
decondensed considerably during isolation. Within the
cell, the chromosome has a diameter of <1 µm. [From H.
Delius and A. Worcel, 1974, J. Mol. Biol. 82:107.]

Eukaryotic Nuclear DNA Associates with 
Highly Conserved Histone Proteins to Form 
Chromatin 

The problem of compacting cellular DNA is also signifi- 
cant for eukaryotic cells. When the DNA from eukaryotic 
nuclei is isolated in isotonic buffers (i.e., buffers with the 
same salt concentration found in cells, ≈0.15 M KCl), it is
found associated with an equal mass of protein in a highly 
compacted complex called chromatin. The general struc- 
ture of chromatin has been found to be remarkably similar 
in all eukaryotic cells. 

The most abundant proteins associated with eukary- 
otic DNA are histones, a family of basic proteins found in 
all eukaryotic nuclei. The five major types of histone proteins
(termed H1, H2A, H2B, H3, and H4) are easily
separated by gel electrophoresis (Figure 9-46). The histone 
proteins are rich in basic amino acids, which contact nega- 
tively charged phosphate groups in DNA. In a fraction of 
the histone proteins of most cells, some of the basic amino 
acid side chains are modified by post-translational addition 
of methyl, acetyl (CH3CO-), or phosphate groups, neu- 
tralizing the positive charge of the side chain or converting 
it to a negative charge. 



FIGURE 9-46 Gel electrophoretic
separation of histone proteins
extracted from chicken blood cells. The
major histone species (H2A, H2B, H3,
and H4) are present in about equal
amount. The other major histones are
H1, which is found in white blood cells
and most other vertebrate cells, and
H5, which is similar to H1 and replaces
it in the red blood cells of birds. The
separation of H1 into three bands
results from differences in the extent
of phosphorylation of residues in the
protein. (Courtesy of V. Allfrey.)

FIGURE 9-47 Electron micrographs of extracted chromatin
in extended and condensed forms. (a) Chromatin isolated
in low ionic strength buffer has an extended "beads-on-a-string"
appearance. The "beads" are nucleosomes (10-nm
diameter) and the "string" is connecting DNA. (b) Chromatin
isolated in buffer with a physiologic ionic strength
(0.15 M KCl) appears as a condensed fiber 30 nm in diameter.
(Left micrograph courtesy of S. McKnight and O. Miller,
Jr.; right micrograph courtesy of B. Hamkalo and J. B.
Rattner.)



The amino acid sequences of four histones (H2A, H2B, 
H3, and H4) from a wide variety of organisms are remark- 
ably similar among distantly related species. For example, 
the sequences of histone H3 from sea urchin tissue and of 
H3 from calf thymus are identical except for a single amino 
acid, and only four amino acids are different in H3 from 
the garden pea and that from calf thymus. Minor histone 
variants encoded by genes that differ from the highly con- 
served major types also exist, particularly in vertebrates. 

The amino acid sequence of H1 varies more from organism
to organism than do the sequences of the other
major histones. In certain tissues, H1 is replaced by special
histones. For example, in the nucleated red blood cells of
birds, a histone termed H5 is present instead of H1 (see
Figure 9-46). Despite minor variations, the similarity in the
amino acid sequences of the major histones among all eu- 
karyotes is most impressive. 

Chromatin Exists in Extended and 
Condensed Forms 

When chromatin is extracted from nuclei and examined in 
the electron microscope, its appearance depends on the salt 
concentration to which it is exposed. At low salt concen- 
tration, isolated chromatin resembles "beads on a string" 
(Figure 9-47a). In this extended form, the string is a thin 
filament of DNA connecting the beadlike structures termed 
nucleosomes. Composed of DNA and histones, nucleo- 
somes are about 10 nm in diameter and are the primary 
structural units of chromatin. If chromatin is isolated at
physiologic salt concentration (≈0.15 M KCl), it assumes
a more condensed fiber-like form 30 nm in diameter
(Figure 9-47b).

Structure of Nucleosomes. Individual nucleosomes
can be isolated by nuclease digestion of extracted chromatin,
because the DNA component of nucleosomes is much
less susceptible to digestion than is the linker DNA connecting
nucleosomes. Partial nuclease treatment first releases
groups of nucleosomes by digestion of the linker
DNA between some of the nucleosomes. More extensive
digestion produces nucleosome tetramers, trimers, and
dimers. Eventually, nuclease treatment digests all the DNA
between individual nucleosomes, so that all the nucleosomes
are released. The DNA content of a single nucleosome
plus the DNA linking neighboring nucleosomes varies
between 160 and 200 base pairs in different organisms.
After digestion of all the linker DNA, nucleosomes from all
eukaryotes contain close to 146 base pairs of DNA.

A nucleosome is composed of a protein core with DNA 
wound around its surface like thread around a spool. The 
core is an octamer containing two copies each of histones
H2A, H2B, H3, and H4. X-ray crystallography has shown
that the octameric histone core is disk shaped (Figure 
9-48). About 146 base pairs of DNA are wrapped slightly 
less than two turns around the core to form the 
nucleosome. 

Assembly of Nucleosomes. Newly replicated DNA
quickly associates with already formed histone octamers. A
model of nucleosome assembly has been proposed based
on studies with rapidly dividing fertilized frog oocytes.
Analysis of protein complexes isolated from early frog
embryos revealed two acidic nonhistone proteins associated
with the basic histone proteins that were not assembled
into nucleosomes. One of these nonhistone proteins,
called nucleoplasmin, was found bound to H2A and H2B;
the other, called N1 protein, to H3 and H4. When partially
purified preparations of these two complexes were mixed
in the presence of DNA, nucleosomes were formed with
release of free nucleoplasmin and N1 (Figure 9-49). Proteins
resembling nucleoplasmin and N1 have been identified
in other cell types. Thus, the proposed pathway of
nucleosome assembly may operate in most cells.



348 Chapter 9 The Molecular Anatomy of Genes and Chromosomes 



FIGURE 9-48 Structure of the histone octamer and
the nucleosome. (a) Model of octameric histone core based
on a 3.1 Å resolution structure determined by x-ray crystallography.
The histone core contains two copies each of H2A
(light blue), H2B (dark blue), H3 (green), and H4 (white). The
spheres represent amino acid residues, not atoms. The positively
charged arginine and lysine residues are red. The
amino termini of the histone proteins are not visualized by
x-ray crystallography, but they are thought to extend outward
from the top and bottom of this view of the histone
octameric core. (b) Model of the nucleosome in which the
octameric core is represented by one centrally located
(H3/H4)2 tetramer (white) flanked by two H2A/H2B dimers
(blue); 146 base pairs of DNA (gray) are wrapped 1.75 supercoil
turns around the histone core. In the central area of the
picture, the DNA bases (light gray) have been stripped away
and the path of the phosphodiester backbones is represented
by medium and dark gray spheres; these are "undersized"
in order to visualize the matching of the pattern of
the positively charged residues (red spheres) on the surface
of the histone octamer with the negatively charged DNA
backbone. The α-helix dipoles are indicated by orange. [See
G. Arents and E. N. Moudrianakis, 1993, Proc. Natl. Acad.
Sci. 90:10489; courtesy of E. N. Moudrianakis.]



Solenoid Structure of Condensed Chromatin. In its
condensed form, chromatin appears as fibers ≈30 nm in
diameter (see Figure 9-47b). A model for the structure of
these thick fibers is shown in Figure 9-50. In this model,
nucleosomes are packed into a spiral or solenoid arrangement
with six nucleosomes per turn. A fifth histone, H1, is
bound to the DNA on the inside of the solenoid, with one
H1 molecule associated with each nucleosome. The unit of
one nucleosome plus one bound H1 is referred to as a chromatosome.
Under various conditions, condensed chromatin
is further folded into giant supercoiled loops.

As noted earlier, when chromatin is extracted at the 
physiologic salt concentration, condensed 30-nm solenoid 
fibers are obtained. However, when extraction is done at a
low salt concentration, H1 is released, yielding the extended
beads-on-a-string form. Thus, depending on the
extraction conditions, two forms of chromatin can be ob- 
served experimentally in vitro. As discussed in the next sec- 
tion, the chromatin in chromosomal regions that are not 
being transcribed exists predominantly in the condensed 
form, whereas that in regions being transcribed probably 
assumes the extended form.

FIGURE 9-49 Proposed pathway of nucleosome
assembly in frog eggs. Both N1 and nucleoplasmin are acidic
proteins that have been shown to associate with histones as
indicated. [Adapted from S. M. Dilworth et al., 1987, Cell
51:1009.]

tance to enzymatic degradation (31,32) because of their reduced conformational
flexibility. In addition, cyclic peptides have been used for the construction of con- 
formationally defined templates (33). Therefore, this laboratory has prepared a 
positional scanning cyclic template combinatorial library in which the active com- 
pounds were found to be stable to proteolytic enzymes (34). 

To circumvent the potential therapeutic limitations relevant to the active compounds
found in the L-amino acid libraries, libraries consisting of D- and/or unnatural
amino acid peptides have been used to identify active compounds having much
greater enzymatic stability (35).

11.4.2 Peptidomimetic Soluble Combinatorial Libraries 

The preparation of libraries of oligomeric N-alkylated glycines (13,14), termed 
peptoids, was the first report of the generation of peptidomimetic libraries. Favor- 
able changes in the physical and chemical properties of the peptidomimetic com- 
pounds relative to peptides, such as enhanced resistance to proteolytic enzymes, 
increased acid stability, favorable aqueous-organic partitioning characteristics, and
so forth, are possible with such libraries. 

A simpler approach, which greatly expands the diversity of combinatorial libraries,
termed the "libraries from libraries" concept, has been developed in our
laboratory (15). With this concept, an existing peptide library was exhaustively
permethylated while still attached to the solid support used in its synthesis. Since
this approach is based on the transformation of a well-defined peptide combinatorial
library, and since the chemical transformation is performed using solid-phase methods,
equimolarity of the compounds within the peptidomimetic library is easily
ensured. A range of chemical transformations can be envisioned to generate a
number of peptidomimetic libraries. Thus, a number of peptide libraries, such as
those described in Figure 11.7, have been peralkylated using a variety of alkylating
agents, including methyl iodide, allyl bromide, and benzyl bromide (36). An example
of the chemical structure of one of these peralkylated libraries composed of
permethylated hexapeptides is shown in Figure 11.8. The effect of these modifications
is that the resulting compounds have very different physical, chemical, and
biological properties than their parent compounds. The screening of each peralkylated
library in various bioassays led to the identification of highly active compounds
derived from completely different parent peptides.

An illustration of the utility of such libraries is presented in Figure 11.9, in which
a permethylated positional scanning hexamer library was screened in a standard
microdilution assay to identify individual permethylated compounds having potent
antimicrobial activity against Staphylococcus aureus. Using the structural information
from the most active of the 120 permethylated mixtures in this library, 72
individual peptides were synthesized, permethylated, and cleaved. The permethylated
form of LFIFFF-NH2 was found to be the most active compound (IC50 = 6
µg/mL and MIC = 11 to 15 µg/mL, where the IC50 and MIC values represent the
concentrations necessary to inhibit 50 and 100% cell growth, respectively). These
compounds showed similar activities against a methicillin-resistant strain of
S. aureus.

Figure 11.8 N-permethylated hexapeptide combinatorial library. represents the side
chains of a mixture of the 20 proteogenic amino acids. The side chains of C, D, E, H, K, N,
Q, R, and Y have also been modified.



11.4.3 Organic Chemical Libraries 

Organic chemical libraries fall into two categories: polymer based and nonpolymer 
based. In the first category, the synthesis of a small library of oligocarbamates (256 
discrete compounds) and its screening against a monoclonal antibody have been 
reported (37). In our laboratory, polymer-based organic chemical libraries of large 
diversity have been synthesized using the libraries from libraries approach. The 
initial application of this concept to form organic libraries was through the generation
of a library of substituted polyamines (34 million) (16). To generate the library,
a well-characterized hexapeptide library was exhaustively reduced to generate millions
of substituted polyamines. This library was found to have substantial activity
in both receptor-binding and antimicrobial microdilution assays. Related polyamine
libraries have also been synthesized from the exhaustive reduction of peralkylated
libraries. Current projects in our laboratory involve the extension of the libraries
from libraries concept to form libraries of hydroxylamines, nitrosamines, hydrazines,
and so forth.

Advances in the application of chemical reactions to the solid phase initially led
to the multiple synthesis (<200 compounds) of discrete nonoligomeric compounds.
The synthesis of benzodiazepines (192 compounds) on plastic pins using the microtiter
plate format has been reported (17,38), as well as the related syntheses of
benzodiazepines and hydantoins (40 compounds) using fritted glass chambers (39).
It should be noted that, in each case, these compounds were prepared as discrete
products, eliminating the productivity advantage of combinatorial libraries during
the assay portion of the process. The feasibility of using pooled combinatorial
chemical libraries was first validated by the synthesis of mixtures of β-mercaptoketones
(9 compounds) (40) and the synthesis and screening of potential antioxidants
(27 compounds) (41). However, validation of the ability of individual assays
to distinguish between compounds having the potential for multiple opposing properties
remains to be proven when using combinatorial mixtures containing a large
diversity of nonpolymeric compounds. The screening of a library consisting of 7600
acylated and alkylated amino acids, which yielded compounds with an affinity for
streptavidin, has been reported (42). Similar strategies using diverse chemical reactions
such as alkylations, acylations, reductions, and oxidations have been used in
our laboratory to sequentially generate large combinatorial libraries (>10,000 com-

1 INTRODUCTION

By historical imperative, the role of molecular
modeling in drug design has been
divided into two separate paradigms: one
centered on the structure-activity problem,
which attempts to rationalize biological
activity in the absence of detailed, three-dimensional
structural information about
the receptor, and the other focused on
understanding the interactions seen in
receptor-ligand complexes, which uses the
known three-dimensional structure of the
therapeutic target to design novel drugs.
The rapid increase in relevant structural
information, as a result of advances in
molecular biology that is used to generate
the target proteins in adequate quantities
for study, and the equally impressive gains
in NMR (1-3) and crystallography that
provide three-dimensional structures have
stimulated the need for design tools, and
the molecular modeling community is 
rapidly evolving useful approaches. The 
more common problem, however, is one in 
which the receptor can only be inferred 
from pharmacological studies and little, if 
any, structural information is available to 
guide modeling. Nevertheless, useful infor- 
mation that can guide the design and syn- 



4 Unknown Receptors 

etc.) as the macromolecular component, 
i.e., binding site, of recognition of bio- 
logically active small molecules. 



4.1 Pharmacophore Versus Binding-site 
Models 

4.1.1 PHARMACOPHORE MODELS. It is often
useful to assume that the receptor site is
rigid and that structurally different drugs
bind in conformations that present a similar
steric and electronic pattern, the pharmacophore.
Most drugs, because of inherent
conformational freedom, are capable of
presenting a multitude of three-dimensional
patterns to a receptor. This pharmaco- 
phoric assumption leads to a problem state- 
ment that logically is composed of two 
processes. The first is the determination, by 
chemical modification and biological test- 
ing, of the relative importance of different 
functional groups in the drug to receptor 
recognition. This can give some indication 
of the nature of the functional groups in the 
receptor that are responsible for binding 
the set of drugs. Second, a hypothesis is 
proposed (Fig. 15.27) concerning corre- 
spondence, either between functional 
groups (pharmacophore) in different 
congeneric series of the drug or between
recognition site points postulated to exist
within the receptor (binding-site model).
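
One simple way to make the idea of a shared "steric and electronic pattern" concrete is to compare the distances between corresponding functional groups in two candidate conformations. The sketch below is only an illustration under stated assumptions: the group labels A, B, and C, the coordinates, the 0.5 Å tolerance, and both function names are hypothetical, and practical pharmacophore methods also account for group type, chirality, and conformational search.

```python
from itertools import combinations
from math import dist


def distance_pattern(groups):
    """Map each pair of group labels to the distance between them."""
    labels = sorted(groups)
    return {(a, b): dist(groups[a], groups[b]) for a, b in combinations(labels, 2)}


def same_pattern(mol_a, mol_b, tolerance=0.5):
    """True if both conformations present the same inter-group distances.

    Distances are independent of how each molecule is positioned or
    rotated, so no superposition step is needed for this crude check.
    """
    pa, pb = distance_pattern(mol_a), distance_pattern(mol_b)
    if pa.keys() != pb.keys():
        return False
    return all(abs(pa[pair] - pb[pair]) <= tolerance for pair in pa)


if __name__ == "__main__":
    # Hypothetical coordinates (in angstroms) for groups A, B, and C.
    drug_1 = {"A": (0.0, 0.0, 0.0), "B": (3.0, 0.0, 0.0), "C": (0.0, 4.0, 0.0)}
    drug_2 = {"A": (1.0, 1.0, 1.0), "B": (1.0, 4.0, 1.0), "C": (5.0, 1.0, 1.0)}
    drug_3 = {"A": (0.0, 0.0, 0.0), "B": (3.0, 0.0, 0.0), "C": (0.0, 6.0, 0.0)}
    print(same_pattern(drug_1, drug_2))
    print(same_pattern(drug_1, drug_3))
```

A distance-only comparison cannot distinguish mirror-image arrangements, which is one reason it should be read as a sketch of the correspondence idea rather than a usable screening method.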

The intellectual framework for using 
structure-activity data to extrapolate in- 
formation regarding the ligand's partner 
(the receptor) is the concept of the phar- 
macophore. The pharmacophore, a concept 
introduced by Ehrlich at the turn of the 
century, is the critical three-dimensional 
arrangement of molecular fragments (or the 
distribution of electron density) that is 
recognized by the receptor and, in the case 
of agonists, that causes subsequent activa- 
tion of the receptor on binding. In other 
words, some parts of the molecule are 
essential for interaction, and they must be 
capable of assuming a particular three-di- 
mensional pattern that is complementary to 
the receptor to interact favorably. One 
corollary of the pharmacophoric concept is 
the ability to replace the chemical scaffold 
holding the pharmacophoric groups with 
retention of activity. This is the basis of the 
current activity in peptidomimetics in which 
the amide backbone of peptides has been 
replaced by sugar rings, steroids (249, 250), 
benzodiazepines (251), or carbocycles 
(252,253) (Fig. 15.28). In the phar- 
macophoric hypothesis, physical overlap of 
similar functional groups is assumed, i.e., 
the carboxyl group from compound A 
physically overlaps with the corresponding 
carboxyl group from compound B and with 
the bioisosteric tetrazole ring of compound 
C. 

Fig. 15.27 (a) Pharmacophore hypothesis with correspondence of functional groups in drugs, A = A', B = B', 
C = C'. (b) Binding-site hypothesis using drugs with hypothetical binding sites attached (X, Y, and Z overlap). 

Fig. 15.28 Peptidomimetics that have been designed based on iterative introduction of constraints into parent 
peptide and hypotheses concerning receptor-bound conformation. Enkephalin mimetic (254), RGD platelet 
GPIIb/IIIa receptor antagonists (250, 251), thyroliberin (TRH) (253), and somatostatin (249, 255). 
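
The overlap assumed by the pharmacophoric hypothesis can be illustrated with a minimal sketch. It assumes that each compound has already been reduced to a few labelled functional groups with three-dimensional coordinates, and that a bioisostere (such as the tetrazole) has been given the same label as the group it replaces. All coordinates, labels, and the tolerance are hypothetical and chosen only for illustration. Matching interpoint distances is a necessary condition for physical overlap, not a sufficient one, because distances alone do not distinguish mirror-image arrangements.

```python
import math
from itertools import combinations


def distance_signature(groups):
    """Map each pair of group labels to the distance between the two groups."""
    return {
        (a, b): math.dist(groups[a], groups[b])
        for a, b in combinations(sorted(groups), 2)
    }


def same_pattern(groups_1, groups_2, tol=0.5):
    """True if both molecules present the same labels at matching distances."""
    sig_1 = distance_signature(groups_1)
    sig_2 = distance_signature(groups_2)
    if sig_1.keys() != sig_2.keys():
        return False
    return all(abs(sig_1[pair] - sig_2[pair]) <= tol for pair in sig_1)


# Hypothetical coordinates in angstroms; "acid" stands for a carboxyl group
# or a bioisosteric replacement such as a tetrazole ring.
compound_a = {"acid": (0.0, 0.0, 0.0), "amine": (5.0, 0.0, 0.0), "aromatic": (2.5, 4.0, 0.0)}
compound_c = {"acid": (1.0, 1.0, 1.0), "amine": (1.0, 6.0, 1.0), "aromatic": (-3.0, 3.5, 1.0)}
compound_x = {"acid": (0.0, 0.0, 0.0), "amine": (8.0, 0.0, 0.0), "aromatic": (2.5, 4.0, 0.0)}

print(same_pattern(compound_a, compound_c))  # same pattern in a different frame
print(same_pattern(compound_a, compound_x))  # amine displaced by 3 angstroms
```

Comparing distances rather than raw coordinates makes the test independent of how each molecule happens to be positioned in space.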

One caveat that must be remembered is 
the probability of alternate or multiple 
binding modes. The interaction of a ligand 
with a binding site depends on the free 
energy of binding, a complex interaction 
with both entropic and enthalpic compo- 
nents. Simple modifications in structure 
may favor one of several nearly energetical- 
ly equivalent modes of interaction with the 
receptor and change the correspondence 
between functional groups that has previ- 
ously been assumed and supported by ex- 
perimental data. Changes in the binding 
mode of an antibody Fab fragment to 
progesterone and its analogs have been 
shown by crystallography of the complexes 
(256,257). For this reason, analysis of 
agonists as a class is usually preferred, as 
the necessity to both bind and trigger a 
subsequent transduction event is more re- 
strictive than the simple requirement for 
binding shared by antagonists (235). Com- 
pounds that clearly are inconsistent with 
models derived from large amounts of 
structure-activity data may be indicative of 
such changes in binding mode and may 
require a separate structure-activity study 
to characterize their interaction. 

4.1.2 BINDING-SITE MODELS. One major 
deficiency in the approach described above 
is the requirement for overlap of functional 
groups in accord with the pharmacophoric 
hypothesis. While it is true that molecules 
having functional groups that show three- 
dimensional correspondence can interact 
with the same site, it is also true that a 
particular geometry associated with one site 
is capable of interacting with equal affinity 
with a variety of orientations of the same 
functional groups. One has only to consider 
the cone of nearly equal energetic arrange- 
ments of a hydrogen-bond donor and ac- 
ceptor to realize the problem. Sufficient 
examples from crystal structures of drug- 
enzyme complexes and from theoretical 
simulation of binding compel the realiza- 
tion that the pharmacophore is a limiting 
assumption. Clearly, the observed binding 
mode in a complex represents the optimal 
position of the ligand in an asymmetric 
force field created by the receptor that is 
subject to perturbation from solvation and 
entropic considerations. Less restrictive is 
the assumption that the receptor-binding 
site remains relatively fixed in geometry 
when binding the series of compounds 
under study. Experimental support for such 
a hypothesis can be found in crystal struc- 
tures of enzyme-inhibitor complexes in 
which the enzyme presents essentially the 
same conformation, despite large variations 
in inhibitor structures; studies of HIV-1 
protease complexed with diverse inhibitors 
support this view (137). In recent years, 
therefore, there has been an increasing 
effort to focus on the groups of the re- 
ceptor that interact with ligands as being 
the common features for recognition of a 
set of analogs. When pharmacophore and 
binding-site hypotheses are compared, the 
binding-site model is physicochemically 
more plausible, because overlap of func- 
tional groups in binding to a receptor is 
more restrictive than assuming the site 
remains relatively fixed when binding dif- 
ferent ligands. However, the number of 
degrees of freedom in binding-site hypoth- 
eses (represented by the necessary addition 
of virtual bonds between groups A and X, 
B and Y, and C and Z in Figure 15.27) is 
greater. Additional degrees of freedom 
complicate subsequent conformational 
analyses and may preclude any conclusions, 
unless a sufficiently diverse set of com- 
pounds is available. 
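
The cone of nearly equivalent hydrogen-bond arrangements mentioned above can be sketched numerically. The following is only an illustration: the donor position, the bond axis, the donor-acceptor distance of 2.9 angstroms, and the 20 degree half-angle are all assumed values, not data from the studies cited. The point is that one fixed ligand group is compatible with a whole ring of receptor site-point positions, each at the same distance and the same angle from the donor axis.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def cone_site_points(donor, axis, distance=2.9, half_angle_deg=20.0, n=8):
    """Return n positions on a cone around `axis`, all `distance` from `donor`."""
    ax = _unit(axis)
    # Any vector not parallel to the axis serves to build a perpendicular frame.
    helper = (1.0, 0.0, 0.0) if abs(ax[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _unit(_cross(ax, helper))
    v = _cross(ax, u)
    t = math.radians(half_angle_deg)
    points = []
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        direction = tuple(
            math.cos(t) * ax[i]
            + math.sin(t) * (math.cos(phi) * u[i] + math.sin(phi) * v[i])
            for i in range(3)
        )
        points.append(tuple(donor[i] + distance * direction[i] for i in range(3)))
    return points


# Hypothetical donor at the origin with its N-H bond pointing along +z.
site_points = cone_site_points((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
for p in site_points:
    print(tuple(round(c, 2) for c in p))
```

Each returned position is a separate candidate site point, which is one way to picture why binding-site hypotheses carry more degrees of freedom than a strict overlap of ligand groups.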

Other approaches to this problem have 
emphasized comparison of molecular prop- 
erties rather than atom correspondences. 
Kato et al. (258) developed a program that 
allows construction of a receptor cavity 
around a molecule, emphasizing the 
electrostatic and hydrogen-bonding capa- 
bilities. Other molecules can then be fit 
within the cavity to align them. This is 
similar in concept to the field-fit techniques 
available in the CoMFA module of 
SYBYL, in which the molecular field 
(electrostatic and steric) surrounding a se- 
lected molecule becomes the objective 
criterion for alignment of subsequent mole- 
cules for analysis. An example emphasizing 
molecular properties in pharmacophoric 
analysis has been given on inhibitors of 
cAMP phosphodiesterase II (259). 

4.1.3 MOLECULAR EXTENSIONS. If one as- 
sumes the binding-site points remain fixed 
and can augment the drug with appropriate 
molecular extensions that include the bind- 
ing site (e.g., a hydrogen-bond donor cor- 
rectly positioned next to an acceptor), one 
can then examine the set of possible 
geometric orientations of site points to see 
if one is capable of binding all the ligands. 
Here, the basic assumption of rigid site 
points is more reasonable, at least for 
enzymes that have evolved to catalyze 
reactions and must, therefore, position 
critical groups in a specific three-dimen- 
sional arrangement to create the correct 
electronic environment for catalysis. The 
program checks this hypothesis by deter- 
mining if one or more geometrical arrange- 
ments of the postulated groups of site 
points are common to the set of active 
compounds. Such a geometrical arrange- 
ment of receptor groups becomes a candi- 
date binding-site model, which can be 
evaluated for predictive merit. 

In a study of the active site of angioten- 
sin-converting enzyme (ACE) (260), this 
binding site model was used by incorporat- 
ing the active site components as parts of 
each compound undergoing analysis. As an 
example, the sulfhydryl portion of captopril 
was extended to include a zinc bound at the 
experimentally optimal bond length and 
bond angle for zinc-sulfur complexes (Fig. 
15.29). The orientation map (OMAP), 
which is a multidimensional representation 
of the interatomic distances between phar- 
macophoric groups (Fig. 15.30), was based 
on the distances between binding-site 
points such as the zinc atom, with 
additional degrees of torsional freedom 
introduced to accommodate the possible 
positioning of the zinc relative to ACE 
inhibitors such as captopril (262). Analyses 
of nearly 30 different chemical classes (Fig. 
15.31) of ACE inhibitors led to a unique 
arrangement of the components of the 
active site postulated to be responsible for 
binding the inhibitors. The displacement of 
the zinc atom in ACE to a location more 
distant from the carboxyl-binding Arg seen 
in carboxypeptidase A is compatible with 
the fact that ACE cleaves dipeptides from 
the C-terminus of peptides whereas car- 
boxypeptidase A cleaves single amino acid 
residues. 

Fig. 15.29 Extension of sulfhydryl group of captopril to include postulated active site zinc, using optimal bond 
length and angles (260,261). 

Fig. 15.30 Distances used in five-dimensional OMAP 
for analysis of ACE inhibitors (260). 

Visualization of the OMAP is useful to 
judge the additional information intro- 
duced as each new compound is added 
(Fig. 15.32). Computationally, it is much 
more efficient to treat the set of non- 
congeneric compounds simultaneously 
(77,263), but it is reassuring when identical 
results are obtained if one uses the sequen- 
tial procedure, introducing each molecule 
in turn, so that intermediate results may be 
visually verified. The use of computer 
graphics to confirm intermediate processing 
of data in convenient display modes be- 
comes increasingly more important as the 
individual computations and numbers of 
molecules under consideration increase. 
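
The agreement between the sequential and simultaneous procedures follows from the fact that set intersection does not depend on order. The sketch below treats an OMAP as the set of binned distance tuples that a molecule can reach over its sampled conformations. This representation is an assumption made for illustration. Two distances per tuple are used instead of the five in the ACE study, and all distance values and the 0.5 angstrom bin width are hypothetical.

```python
import math


def omap(conformer_distances, bin_width=0.5):
    """Return the set of binned distance tuples reachable by one molecule."""
    return {
        tuple(math.floor(d / bin_width) for d in distances)
        for distances in conformer_distances
    }


def sequential(omaps):
    """Intersect the OMAPs one molecule at a time, recording each size."""
    current = set(omaps[0])
    sizes = [len(current)]
    for next_map in omaps[1:]:
        current &= next_map
        sizes.append(len(current))
    return current, sizes


def simultaneous(omaps):
    """Intersect all OMAPs in a single step."""
    return set.intersection(*omaps)


# Hypothetical distance pairs (angstroms) sampled over conformations.
molecule_1 = omap([(4.1, 6.2), (4.6, 6.3), (5.2, 7.1), (3.3, 5.8)])
molecule_2 = omap([(4.2, 6.1), (5.3, 7.2), (6.6, 8.1)])
molecule_3 = omap([(5.1, 7.3), (2.2, 3.3)])

final, sizes = sequential([molecule_1, molecule_2, molecule_3])
print(sizes)                                                     # OMAP size after each molecule
print(final == simultaneous([molecule_1, molecule_2, molecule_3]))
```

The list of sizes plays the role of the intermediate displays described above: each added compound can only leave the surviving region unchanged or shrink it.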

4.1.4 ACTIVITY VERSUS AFFINITY. Given a 
consistent model of either type, a limitation 
is that one can only ask if the compound 
under consideration can present the three- 
dimensional electronic pattern (phar- 
macophore) that is the current candidate. 
In other words, one is limited to predicting 
the presence or absence of activity, a binary 
choice. Even the presence of the appropriate 
pattern is insufficient to ensure biological 
activity. For example, competition 
with the receptor for occupied space by 
other parts of the molecule can inhibit 
binding and preclude activity. One can, 
therefore, postulate the following condi- 
tions for activity: 

1. The compound must be metabolically 
stable and capable of transport to the 
site for receptor interaction (interpreta- 
tion of inactive compounds may be 
flawed by problems with bioavail- 
ability). 

2. The compound must be capable of as- 
suming a conformation which will pres- 
ent the pharmacophoric or binding-site 
pattern complementary to that of the 
receptor. 

3. The compound must not compete with 
the receptor for space while presenting 
the pharmacophoric or binding-site pat- 
tern. 

Once these conditions are met, one can 
attempt to deal with the potency, or bind- 
ing affinity. This belongs to the domain of 
three-dimensional quantitative structure- 
activity relationships (3D-QSAR) (264); 
the use of the variant CoMFA (148,265) 
on ACE inhibitors will be illustrated at the 
end of this chapter. Condition 3 allows one 
to use compounds that are capable of 
presenting the pharmacophoric pattern but 
incapable of binding to help determine the 
location of receptor-occupied space in relation 
to the pharmacophore (receptor mapping) 
(266). This yields a crude, low-resolution 
map of the position of the receptor 
relative to the pharmacophoric elements 
and indicates in which directions chemical 
modifications may be productive. 
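
Receptor mapping under condition 3 can be sketched with a coarse grid. The assumptions here are strong and are made only for illustration: all molecules are already aligned on the common pharmacophore, each atom is represented by the single grid cell containing its center, and every coordinate is hypothetical. Cells occupied by any active compound are taken to be free of receptor; cells occupied only by an inactive compound that can still present the pattern are candidates for receptor-occupied space.

```python
import math


def occupied_cells(atoms, spacing=1.0):
    """Return the grid cells that contain the atom centers of one molecule."""
    return {tuple(math.floor(x / spacing) for x in atom) for atom in atoms}


def receptor_map(actives, inactive, spacing=1.0):
    """Return (cells available to ligands, cells likely occupied by receptor)."""
    allowed = set()
    for molecule in actives:
        allowed |= occupied_cells(molecule, spacing)
    clash = occupied_cells(inactive, spacing) - allowed
    return allowed, clash


# Hypothetical atom centers (angstroms), pre-aligned on the pharmacophore.
active_1 = [(0.2, 0.3, 0.1), (1.4, 0.2, 0.1), (2.6, 0.4, 0.2)]
active_2 = [(0.3, 0.2, 0.2), (1.5, 0.3, 0.1), (1.6, 1.5, 0.2)]
inactive = [(0.2, 0.3, 0.1), (1.4, 0.2, 0.1), (1.5, -1.4, 0.2)]

allowed, clash = receptor_map([active_1, active_2], inactive)
print(sorted(allowed))
print(sorted(clash))
```

In this toy case the inactive compound differs from the actives only by a substituent pointing in the negative y direction, so that direction would be flagged as unproductive for further modification.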

Fig. 15.31 Compounds from different chemical classes of ACE inhibitors used in active site analysis. 
From Ref. 260. 

Fig. 15.32 Change in OMAP (projection of three of the five dimensions) as new compounds were introduced to 
analysis of ACE inhibitors (260). The original OMAP of compound 1 (see Fig. 15.31) is to the left; the 
OMAP after completion of analysis is to the right. 

The number and diversity of compounds 
available for analysis determine the meth- 
odology to be used. If there is a limited 
data set, then the pharmacophoric ap- 
proach should be assessed first, because of its 
fewer degrees of freedom. If no phar- 
macophoric patterns are consistent with the 
set of analogs, then introduction of logical 
molecular extensions to enable the active 
site approach is warranted. Operationally, 
one first determines the set of potential 
pharmacophoric patterns consistent with 
the set of active analogs, leading to its 
name: active analog approach (262). If 
there are sufficient data, then a unique 
pharmacophore, or active site model, may 
be identifiable. The basic assumption 
behind efforts to infer properties of the 
receptor from a study of structure-activity 
relations of drugs that bind is the idea of 
complementarity. It follows that the 
stronger the binding affinity, the more 
likely that the drug fits the receptor cavity 
and aligns those functional groups that 
have specific interactions in a way com- 
plementary to those of the receptor itself. 
Certainly, our understanding of inter- 
molecular interactions from studies of 
known complexes does not dissuade us from 
this notion but may make us somewhat 
skeptical of the naive models that often result 
from such efforts. Andrews et al. (267) 
have reviewed efforts of this type with 
regard to CNS drugs. 

Clearly, the key to insight lies in 
chemical modification to determine the 
relative importance of functional groups for 
molecular recognition. Often more subtle 
effects than the simple presence or absence 
of a group are important, and then com- 
parison of molecular properties becomes of 
interest. A major impediment to analysis is 
the definition of a common frame of refer- 
ence by which to align molecules for com- 
parison. This is equivalent to solving the 
three-dimensional pharmacophoric pattern 
and implies that one has distinguished 
those properties of the molecules under 
consideration in a manner similar to the 
receptor. Initial efforts to rationalize struc- 
ture-activity relationships (SAR) among 
noncongeneric systems were hampered by 
an "RMS mentality," i.e., a point of view 
that required atomic centers to align rather 
than sterically and electronically similar 
groupings to overlap. An example would be 
requiring the six atoms of aromatic benzene 
rings to overlap at each of the six atoms of 
the ring vertices rather than the simple 
requirements for coincidence and copla- 
narity, which would recognize the torus of 
electron density that the rings share in 
common (Fig. 15.33). In congeneric series, 
the difficulty in assigning correspondence 
is less (nonexistent by definition). 
This allows a variety of approaches, 
including those based on molecular graph 
theory (268-271), to detect similarities 
between molecules, which can form the 
basis of a correlation analysis. Extrapola- 
tion outside of the group of congenerically 
related compounds on which the analysis 
was based would appear difficult, if not 
impossible. 

Fig. 15.33 Torus of electron density representing a benzene ring. Atom-to-atom correspondences of ring atoms 
used in normal fitting routines lead to overconstrained fits. 
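
The benzene-ring example of the "RMS mentality" can be made concrete. The sketch below builds two idealized, coplanar six-membered rings that share a center but are rotated by 30 degrees about the ring normal. The ring radius of 1.39 angstroms is an approximate textbook value, and the function names are illustrative. Atom-by-atom fitting reports a mismatch, whereas the simpler requirements of coincident centroids and parallel ring normals are met exactly.

```python
import math


def benzene_ring(rotation_deg=0.0, radius=1.39):
    """Return six idealized ring-atom positions in the xy plane."""
    return [
        (
            radius * math.cos(math.radians(rotation_deg + 60.0 * k)),
            radius * math.sin(math.radians(rotation_deg + 60.0 * k)),
            0.0,
        )
        for k in range(6)
    ]


def centroid(points):
    return tuple(sum(p[i] for p in points) / len(points) for i in range(3))


def plane_normal(points):
    """Unit normal of a ring, summed over its edges (Newell's method)."""
    nx = ny = nz = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(points, points[1:] + points[:1]):
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / norm, ny / norm, nz / norm)


def atom_rms(ring_1, ring_2):
    """RMS deviation with atom k of one ring matched to atom k of the other."""
    total = sum(math.dist(p, q) ** 2 for p, q in zip(ring_1, ring_2))
    return math.sqrt(total / len(ring_1))


def ring_mismatch(ring_1, ring_2):
    """Return (centroid separation, angle between ring normals in degrees)."""
    separation = math.dist(centroid(ring_1), centroid(ring_2))
    n1, n2 = plane_normal(ring_1), plane_normal(ring_2)
    cos_angle = min(1.0, abs(sum(a * b for a, b in zip(n1, n2))))
    return separation, math.degrees(math.acos(cos_angle))


ring_a = benzene_ring(0.0)
ring_b = benzene_ring(30.0)

print(round(atom_rms(ring_a, ring_b), 3))   # nonzero atom-to-atom deviation
print(ring_mismatch(ring_a, ring_b))        # centroids and planes coincide
```

Using the centroid and normal as the fitting criteria removes the in-plane rotation as a constraint, which is the overconstraint that atom-to-atom fitting introduces.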

While it is simpler to start an analysis 
with a congeneric series to identify the 
recognition elements, diversity in chemical 
structures implies more information regard- 
ing the conformational requirements of the 
system. A congeneric series requires that 
the basic chemical framework of the mole- 
cule remains constant and that groups on 
the periphery are either modified (e.g., 
aromatic substitution) or substituted (e.g., 
tetrazole for carboxyl functional group). 
Implicit in this concept is the notion that 
the compounds bind to the receptor in a 
similar fashion, and therefore, the changes 
are localized and comparable for each 
position of modification. Introduction of 
degrees of freedom in the substituents and 
consideration of differences in properties 
that are conformationally dependent, such 
as the electric field, require conformational 
analysis in an effort to determine the rel- 
evant conformation for comparison. 

The problem can be divided into two questions: 
what are the aspects of the molecules that 
are in common and that may provide the 
basis for molecular recognition, and which 
conformation for each molecule is appro- 
priate to consider? For the first problem, 
studies on a congeneric series can often 
yield valuable insight. For determination of 
the three-dimensional arrangement of the 
crucial recognition elements, diversity in 
the chemical scaffolds imposes different 
constraints on possible three-dimensional 
patterns and generates an opportunity for 
determining a unique solution. 

4.2 Searching for Similarity 

4.2.1 SIMPLE COMPARISONS. To gain in- 
sight into molecular recognition, subtle 
differences in molecules must be perceived. 
Comparisons can be divided into two 
categories: those that are independent of 
the orientation and position of the mole- 
cule and those that depend on a known 



